

CATALOGING QUALITY,
LC PRIORITIES, AND MODELS
OF THE LIBRARY'S FUTURE




OPINION PAPERS 


NO. 1 


Cataloging Quality, LC Priorities, and 
Models of the Library’s Future 


Thomas Mann 




Library of Congress, Cataloging Forum 
Washington, 1991


Library of Congress Cataloging-in-Publication Data 


Mann, Thomas, 1948- 
Cataloging quality, LC priorities, and models of the 
library's future / Thomas Mann. 


21 p. ; 28 cm. -- (Opinion papers ; no. 1) 


1. Cataloging--Government policy--United States. 
2. Cataloging--Data processing--Standards--United States.
3. Cataloging--United States--Quality control. 4. Cataloging 
--United States--Forecasting. 5. Information retrieval. 
6. Communication models. 7. Library of Congress. I. Title. 
II. Series: Opinion papers ; no. 1. 
Z693.5.U6M36 1991 90-026947 
025.5'24--dc20 CIP


This is one in a series of occasional papers devoted to 
cataloging policy and practice. The opinions expressed 
are the authors’ and not necessarily those of the 
Cataloging Forum. 


Copies of this publication are available from members of the 
Cataloging Forum Steering Committee 




CATALOGING QUALITY, LC PRIORITIES, AND MODELS OF THE 
LIBRARY’S FUTURE 


Thomas Mann 


Reference librarians love catalogers, and with good reason. Without catalogers, we cannot do 
our job. Just to establish a baseline on the importance of quality LC cataloging, let me offer six examples 
of what I mean: 


1) A professor who is the president of a classics association, and who has been teaching (and
writing) in the field for twenty years, asked me to find out "How would ancient Greeks and
Romans have transcribed animal noises?"--i.e., today we would write "Quack" or "Oink" or
"Ribbit," but how would the ancients have heard these sounds? He didn't have a clue as to
how to find such information. Fortunately, the Library of Congress Subject Headings
(LCSH) list provides a number of likely category terms for such information; and, equally
fortunate, dozens of commercial indexes tend to use the same subject terms that we devise
here. "Animal sounds" [...]
turn up a promising article in the Social Sciences and Humanities Index (which uses LCSH)
on "Suetonius' [...]" Yet another LCSH category, "Greek
language--Onomatopoeic words," turned up a 439-page dictionary, in Greek, that hit the nail
right on the head. (And I don't even read Greek--but that's one of the points: I do know
how to use LCSH, and because of that I can find things much more efficiently than even full
professors who've spent a lifetime in a field that is literally "Greek to me.")

In this case, too, I had not known in advance that the subdivision "--Onomatopoeic words"
existed; but thanks to the precoordinated display of subdivisions under the LCSH heading
in the catalog, I could recognize it when I saw it.

2) A fiction writer told me she needed background books on "What life is like for people who
live along highways out in the country, not near cities." A question like this is impossible to
research with key word searching, as that technique will produce thousands of irrelevant hits.
There have to be category terms to work with. I asked if she meant something like people
who lived along the old Route 66, and she said yes. We typed in "Route 66" and found four
titles--but from the tracings on these titles (and on other works to which they led) I finally
made a set composed of the LCSH elements "(Automobiles or Roads) and (Road guides or
Guidebooks)," then limited it with the geographic area code ("gac e n-us."); and the woman
was simply amazed. The results gave her, she said, dozens of titles that looked like they were
right on the button; and she hadn't really expected to find any that were very close to what
she wanted.

3) A student wanted information on "non-military relations of the U.S.S.R. with other
countries." We quickly found the LCSH category "Soviet Union--Relations" (after noting
that the subdivisions "Foreign relations" and "Foreign economic relations" were not turning
up what she wanted), and then discovered a host of other subjects through cross-references
and tracings (all to be combined with "Soviet Union"):


Teachers, interchange of
Educational exchanges
Exchange of persons programs
Cultural relations
Students, interchange of

As is evident, the more we got into it, the more the student thought she'd concentrate on
educational exchanges, so I showed her Education Index, too. This is another commercial
index that depends on LC for its own category terms (e.g., all of the above headings can be
found in it)--and it quickly led to relevant journal articles not in LOCIS.

4) A few months ago, just before Mrs. Gorbachev's visit, the Librarian's office needed to find
the date of the construction of the memorial to the millennium of the founding of Russia, in
Novgorod, because we wanted to give Mrs. Gorbachev a picture of the memorial. The
question was eventually relayed to me. I knew from past experience that a good way to find
the date of any such memorial is to first find a guidebook to the city in question, and that
such guides can be found predictably through the "--Description" subheading attached to the
name of the city. In looking under Novgorod with this subdivision, I found that the
guidebooks themselves were clustered in the class DK651 area of the stacks. Arriving there,
I quickly browsed through about three shelves of books--and I could tell from the 
illustrations that one booklet, in Russian, was precisely on the subject of the millennium
memorial. Since I can’t read Russian myself, I immediately took the booklet to a specialist 
in the European Division, who then quickly provided the necessary date to the person who 
had asked me the question in the first place. 


I checked later and found that the catalog record for the memorial booklet did not have the 
"--Description" subject heading itself. Nevertheless, even though the catalog hadn’t shown 
me the record for that book to start with, I could still find the book anyway because the 
second system, the class scheme’s arrangement of full texts, "kicked in" as a backup and 
revealed to me something that I didn’t know enough about to specify in advance in the 
catalog. Because of the functioning of both systems, in other words, someone who knows 
nothing about Russian history, who doesn’t even read the language, and who had 535 miles 
of books to search in (the distance from Washington to Charleston, S.C.), could still find the 
answer to an inquiry on a very obscure point within 15 minutes of being asked the question. 


5) For a biographer who needed information on society's attitudes toward traveling saleswomen
in the early years of the century, I had to browse through many volumes in the class HF5541 
area of the stacks (which I could identify as the right area in the first place through the 
subject headings “Commercial travelers" and "Traveling sales personnel" in the catalog). I 
eventually found a few articles that were right on target in a couple runs of old magazines 
published for salesmen at the time, Commercial Travelers Magazine and Sample Case. The 
catalog alone--whether searched by LCSH or key words or classification numbers--could not 
indicate which of the hundreds of volumes in class HF5541 had the precise information I 
needed. It could only get me to the right general area of the stacks. Only systematic 
browsing of full texts could turn up the desired specific information that was contained within 
them. And I never would have found these full texts in the first place if they hadn’t been 
arranged in a way that allowed for systematic subject browsing. 




6) For a question on the Civil War ironclad vessel Barataria, I found numerous references to
the ship in the E591 area of the stacks, which generally corresponds to the catalog heading 
"United States--History--Civil War, 1861-65--Naval operations.” The catalog itself, however-- 
even when searched by key words--contained no specific references to the Barataria that led 
to precise pages in any of the voluminous E591 material. I simply had to browse full texts 
directly, in a predictably limited subject group created by the classification scheme. The 
subject heading in the catalog, however, had still done its job in serving as the index to the 
class scheme, telling me exactly which area to go to. 
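The set-building search in the second example above, "(Automobiles or Roads) and (Road guides or Guidebooks)" limited by geographic area code, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five records, their identifiers, and the helper names `heading_set` and `limit_by_gac` are invented for the sketch and do not reproduce LOCIS commands or real catalog data.

```python
# Hypothetical miniature catalog. Every record here is invented for illustration;
# "gac" stands in for the geographic area code stored in a fixed field.
RECORDS = [
    {"id": 1, "headings": {"Automobiles", "Road guides"}, "gac": "n-us"},
    {"id": 2, "headings": {"Roads", "Guidebooks"}, "gac": "n-us"},
    {"id": 3, "headings": {"Roads", "Guidebooks"}, "gac": "e-fr"},
    {"id": 4, "headings": {"Automobiles"}, "gac": "n-us"},
    {"id": 5, "headings": {"Road guides"}, "gac": "n-us"},
]


def heading_set(records, heading):
    """Return the ids of all records that carry the given subject heading."""
    return {r["id"] for r in records if heading in r["headings"]}


def limit_by_gac(records, ids, gac):
    """Keep only those ids whose record has the given geographic area code."""
    return {r["id"] for r in records if r["id"] in ids and r["gac"] == gac}


# (Automobiles OR Roads)
topic = heading_set(RECORDS, "Automobiles") | heading_set(RECORDS, "Roads")
# (Road guides OR Guidebooks)
form = heading_set(RECORDS, "Road guides") | heading_set(RECORDS, "Guidebooks")
# AND the two sets, then limit the result to the United States
combined = topic & form
result = limit_by_gac(RECORDS, combined, "n-us")

print(sorted(result))  # [1, 2]
```

The sketch works only because each record already carries controlled headings and a coded area field; with title words alone there would be no stable sets to combine.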


From all of these examples, let me anticipate a point I'll return to later: that if the works which 
provided the answers to these questions had received only Minimal Level Cataloging (MLC), they would 
never have been found. Key word access is primarily useful when the records being searched already 
have LC subject headings attached to them by professional catalogers, and when the records point to 
areas in the bookstacks where full texts are arranged in subject-browsable groups. If only the brief 
catalog records themselves were searchable, and if only their uncontrolled (i.e., key-word variant) titles 
could be searched for subject information, then these questions (and hundreds of other such inquiries 
that we get every year) simply could not be answered. And they could not be answered in spite of the 
fact that the works containing the needed information are in LC’s collections. (Nor are computer- 
searchable authors’ names of much help in such cases--for the vast majority of inquiries there is simply 
no way we can know in advance which authors have addressed the subject we’re being asked about.) 


Having predictable category terms--and systematic avenues of access (i.e., cross-references and 
tracings) enabling me to find the proper category terms--was simply essential in the first four examples. 
And for the fourth through the sixth, a classification scheme that enabled me to search full texts in a 
systematic manner was indispensable--as was the index function of the subject heading system, that told 
me which areas of full texts would be most relevant. Had the journal Sample Case been given minimal 
level cataloging and removed from the class scheme, I never would have found it. Who would ever think 
to search under the words "sample case" for an inquiry like that? And with Barataria, I have very little 
patience with people who tell me that the full text search capability that will "eventually" be available 
online will solve my problem, for who is going to enter the full text of hundreds of old Civil War volumes 
in a way that allows component word searching?--i.e., through manual keying or OCR-scanning, rather 
than the raster-scanning that LC does, which only takes a “picture” of a whole page and doesn’t permit 
key word searches. How will the entry of such data, for which there is so little market, be economically 
feasible either to load or to search? And what are reference librarians supposed to do in the meantime? 


The point here is simple, although the problem is apparently that the point is not obvious: 
Neither I nor any other reference librarian can even begin to have subject expertise in all of the areas in 
which we'll receive queries. But in the large majority of cases we do not need subject expertise as long as 
we have predictable systems of access. And the vocabulary control of LCSH and the groupings of full texts 
brought about by the LC Classification scheme constitute those systems. As long as we have good 
category terms to work with, and functional avenues of access to the category terms provided by cross- 
references and tracings, and full texts (not just brief catalog records) arranged in a predictable and 
systematic manner, and a good index to the class scheme (provided by the subject headings in the 
catalog), then we have the “corkscrews" we need to get the wine out of the bottles. I don’t need subject 
expertise myself, in other words, as long as I can rely on the products of the subject catalogers’ thinking, 
analysis, and expertise being accessible in a routine and predictable manner. 




I am going to argue--with solid reasons, I believe--that the crucial intellectual structure of 
categories and linkages among the categories that is created by our catalogers is now being undercut 
by two factors: 1) a concern for economy that is making cuts according to wrong priorities (i.e., 
wrong when measured according to the criterion of the distinctive feature of LC’s mission), and 2) 
a mistaken assumption that the "computer workstation model of the future” can substitute raw 
computer retrieval power (applied to unstructured data) in place of the intellectual structure that 
reference librarians and researchers now rely on. 


I will be talking primarily about subject cataloging; but before going into that I must 
emphasize that I do not mean to slight the wonderful work done by descriptive catalogers. It, too,
routinely solves problems that simply cannot be handled by computer keyword access. For example, 
I've helped more than one reader who wanted information about Libya’s Muammar Qaddafi. The 
problem, of course, is that the relevant literature exhibits nearly 50 different spellings of his name 
(e.g., Moammar El Kadhafi, Moamar al-Gaddafi, Moamer El Kazzafi, Mu'Ammer el Qathafi, etc.).
Without the standardization of all of the records under one form, the information would be scattered 
to the winds. Key word searching cannot retrieve it, because the key words are widely--even 
wildly--variant from one record to the next. 
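The contrast between key word searching and an authority-controlled heading can be made concrete with a small Python sketch. It rests on assumptions that should be read as such: the four variant spellings are the ones quoted above, but the heading string "Qaddafi, Muammar", the placeholder book titles, and the helper names are chosen only for illustration and are not taken from any actual authority file.

```python
# Illustrative authority record: one uniform heading with variant spellings.
# The heading form used here is an assumption made for the sketch.
AUTHORITY = {
    "Qaddafi, Muammar": [
        "Moammar El Kadhafi",
        "Moamar al-Gaddafi",
        "Moamer El Kazzafi",
        "Mu'Ammer el Qathafi",
    ],
}


def normalize(name):
    """Lowercase, drop apostrophes, treat hyphens as spaces, collapse whitespace."""
    cleaned = name.lower().replace("'", "").replace("-", " ")
    return " ".join(cleaned.split())


# Map every variant (and the heading itself) to the uniform heading.
VARIANT_TO_HEADING = {}
for heading, variants in AUTHORITY.items():
    VARIANT_TO_HEADING[normalize(heading)] = heading
    for variant in variants:
        VARIANT_TO_HEADING[normalize(variant)] = heading

# Placeholder records: each book spells the name its own way.
RECORDS = [
    {"title": "Book A", "name_as_printed": "Moammar El Kadhafi"},
    {"title": "Book B", "name_as_printed": "Moamar al-Gaddafi"},
    {"title": "Book C", "name_as_printed": "Moamer El Kazzafi"},
    {"title": "Book D", "name_as_printed": "Mu'Ammer el Qathafi"},
]


def keyword_search(records, word):
    """Match the literal word against the name as transcribed from the book."""
    return [r["title"] for r in records
            if word.lower() in r["name_as_printed"].lower()]


def heading_search(records, heading):
    """Match on the uniform heading assigned through the authority mapping."""
    return [r["title"] for r in records
            if VARIANT_TO_HEADING.get(normalize(r["name_as_printed"])) == heading]


keyword_hits = keyword_search(RECORDS, "Qaddafi")
heading_hits = heading_search(RECORDS, "Qaddafi, Muammar")
print(keyword_hits)   # []
print(heading_hits)   # ['Book A', 'Book B', 'Book C', 'Book D']
```

The mapping itself is the intellectual work: someone has to decide that these spellings name the same person before any program can group them.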


Similarly, even titles that aren't standardized by descriptive catalogers will wander all over 
the alphabet--e.g., even one as seemingly simple as "Hamlet" proves to be amazingly slippery in the 
editions that actually get published: 


Amleto, Principe di Danimarca
Der Erste Deutsche Bühnen-Hamlet
The First Edition of the Tragedy of Hamlet
Hamlet, A Tragedy in Five Acts
Hamlet, Prince of Denmark
Hamletas, Danijos Princas
Hamleto, Regido de Danujo
The Modern Reader's Hamlet
Montale Traduce Hamlet
Shakespeare's Hamlet
Shakspeare's Hamlet
The Text of Shakespeare's Hamlet
The Tragedy of Hamlet
The Tragicall Historie of Hamlet
La Tragique Histoire d'Hamlet


And still other editions appear in non-roman scripts such as Greek, Hebraic, and Cyrillic. 
The work of the descriptive catalogers, however, assures us that all of these forms will be grouped 
together in one place under a uniform title. The category they create in loading such disparate data 
under one form, in other words, infinitely simplifies the work of retrieving it. 


Corporate name changes are also notoriously slippery, as in the variant forms of one


This kind of thing happens all the time--especially with U.S. government agencies--and we 
simply cannot retrieve the bulk of relevant records by mere key word searching. We need the 
standardization and cross-reference structure provided by authority work. 


I cannot emphasize this enough: what we need for efficient retrieval is the intellectual work 
of cataloging, not just the transcribing of key words. 


And it is precisely this intellectual structure that, more and more, is being undercut these 
days. 


But let me return to subject cataloging. I want to stress this because it tends to get the short 
end of the stick around here. I think, however, that the ugly duckling we so often mistreat is actually 
the swan that should be the source of our greatest pride. 


Specifically, because of the subject catalogers’ work, reference librarians who have no subject 
expertise themselves can nevertheless reasonably predict the following about the systems of access 
that will be available in any subject area: 


• That the principle of uniform heading will be effective--i.e., that we won't have to
think up all possible synonyms (e.g., capital punishment, death penalty, legal execution, 
etc.) or variant title key words (A Life for a Life, To Kill and Be Killed, The Ultimate 
Coercive Sanction, etc.) that are scattered throughout the alphabet. Rather, we will get 
all of the variant forms grouped together for us under the one approved LCSH term 
[Capital punishment]; 


• That the uniform English language heading will include in its coverage all of the
foreign language books on that subject as well (e.g., works on pena capitale, peine de
mort, smertnaia kazn', Todesstrafe, etc., will be grouped together under the very same
heading (Capital punishment) that rounds up all of the English variant phrasings); 


• That the principle of specific entry will be effective--i.e., given a choice of possible
search terms, we will predictably get much greater relevance in our retrieval by using 
the more specific rather than the more general headings (and knowledge of this 
principle is a major advantage in systematic searching); 


• That we can find the specific heading(s) via a systematic network of cross-references
(i.e., we do not have to guess which terms to search under);


• That we can snag still other relevant category terms that are outside the cross-
reference network through the use of the subject tracings that are attached to the
records discovered through title or author searches (tracings, in other words, tell us to 
which categories any particular known item belongs); 


• That we can sharpen and focus questions that are fuzzy to begin with--a frequent
starting point for readers--by examining the array of precoordinated subdivisions under
a heading, that spell out and distinguish its various aspects in ways that we couldn't
think of in advance (i.e., the system will clarify our range of options for us, thereby
enabling users to ask better questions in the first place);




• That the study of the list of possible standard subdivisions can greatly aid reference
librarians in helping readers, as it gives us a foreknowledge of the types of questions that 
can readily be answered, over and above giving the readers an ability to recognize 
options that they did not know in advance; 


• That the catalog will usually not indicate contents of individual chapters of books; and
because of this we can have realistic expectations of what the catalog will not do, and
turn instead to sources such as Essay and General Literature Index or Index to Scientific
Book Contents for this kind of information--or to the classified array of full texts
themselves, in the stacks (see below);


• That we can bring directly to bear on a question a variety of types of literature (e.g.,
dictionaries, handbooks, concordances, directories, etc.), each tailored toward answering 
certain categories of questions, by quickly and easily identifying the “type” of reference 
sources through form subdivisions in the catalog; 


• That established LCSH category terms can be used in scores of sources outside our
own LOCIS system, i.e., in a wide variety of commercially-produced indexes, databases,
special collections, etc., etc.;


• That because so many commercial publishers piggyback on LCSH, and because we
can plug the same term(s) into so many different disciplinary and format indexes, we 
can get an entire range of cross-disciplinary perspectives on the same subject with surprising 


• That Boolean search capabilities are rendered enormously more powerful when
subdivisions can be searched for separately and used as sets themselves, for combination 
with other sets; 


• That key word searching is also rendered much more powerful when we can search
the component words of LCSH terms, which are extra elements added to the records--
not transcribed from the books--assigned by thinking professionals who notice and define
categories and relationships that are not defined by title or note field words;


• That, because of the routine assignment of a standard subdivision, we can easily and
systematically identify published subject bibliographies, whose contents are often far
superior to any bibliography that can be generated from the LOCIS system itself;


• That many subject bibliographies can be easily identified directly in the class scheme,
even without reference to the catalog, because of their being clustered in the Z classes
(e.g., major bibliographies on individual people being arranged alphabetically by
surname in the Z8000s, other subject bibliographies and indexes being arranged by
continent/country in Z1202-4980 and alphabetically by subject in Z5051-7999);




• That extremely specific information--much too narrow to be included in any catalog
records, even those with abstracts or added tables of contents--can still be found through 
the examination and browsing of full texts, provided the texts are arranged in subject 
groups to begin with (rather than being in MLC); 


• That the scattering of the various aspects of a subject (e.g., philosophical, religious,
psychological, biographical, historical, social, economic, political, legal, educational, 
musical, artistic, fictional, poetic, scientific, mathematical, technical, military, 
bibliographical) in the class scheme is corrected by the controlled vocabulary subject 
headings in the catalog, which group together in one place the various aspects that are 
scattered on the shelves, and which display those aspects in relationship to each other 
(often through subdivisions of headings);


• That because the LCSH headings (often with precoordinated subdivisions) group
together in the catalog those subject aspects which are scattered in the stacks, the
heading terms on records in the catalog thereby function very efficiently as the index to
the class scheme, i.e., they show that the same subject can appear in many different
classes; they show the full range of which classes, specifically, would be most promising
for full-text browsing; and, via the distinctions pointed out by precoordinated
subdivisions, they show which subject aspects are located in which areas of the stacks;


• That, in addition to grouping under one subject heading works on a subject that are
scattered in many different classes, the catalog also provides multiple access points (via
author, title, various subject headings and numerous cross-references) to each book in 
the class scheme, in which any work can appear in only one location; and that this, too, 
functionally makes the catalog an excellent index to the classification scheme; 


• That the classified arrangement of full texts will enable us to recognize useful
[...]
the catalog;


• That either of the two systems--LCSH headings in the catalog (i.e., not just in the red
books) or full texts arranged in the LC class scheme--will enable us to recognize subject 
information that cannot be reached through the other system alone (or, for that matter, 
through key word searches); 


• That foreign-language books in the stacks are findable in the very same category
groupings that hold the English books (just as, analogously to what was described above,
the subject groupings in the catalog also include foreign titles); 


• That because of the multilingual and international groupings in both the
LCSH-arranged catalog and the class scheme, scholars at LC--unlike researchers at any
other national library--can find both sources of primary importance to their subjects and
the supporting literature easily and in a systematic fashion;


• That sets created by online retrieval which are too large--a frequent problem--can be
cut down to more manageable size relatively efficiently by means of limit commands on
fixed fields that aren't even displayed to users (e.g., limits can be done by language, by
presence of a bibliography, by geographic area code, etc.);




• That millions of full texts can be examined systematically in predictably narrow groups
in the class scheme--much more than can be searched for in even the very largest full-
text databases;


• That full texts in the class scheme can be searched systematically without the need for
the Library or the individual researcher to pay stiff copyright royalty or licensing fees to 
anyone (NEXIS, by contrast, costs approximately $3 per minute to search, and its 
contents are submicroscopic in extent compared to the 535 miles of texts at LC). 
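The "index to the class scheme" function claimed in the list above can be illustrated with a short Python sketch that inverts a handful of records into a heading-to-class lookup. The two pairings of headings with class areas (HF5541 and E591) come from the examples at the start of this paper; the record layout, the function name `build_class_index`, and the extra record in a placeholder class are assumptions added only to show one heading pointing to more than one area of the stacks.

```python
from collections import defaultdict

# Minimal stand-in records: subject headings plus the class area where the
# full text is shelved. "XX0000" is a placeholder class, not a real number.
RECORDS = [
    {"class_area": "HF5541", "headings": ["Commercial travelers"]},
    {"class_area": "HF5541", "headings": ["Traveling sales personnel"]},
    {"class_area": "XX0000", "headings": ["Commercial travelers"]},
    {"class_area": "E591",
     "headings": ["United States--History--Civil War, 1861-65--Naval operations"]},
]


def build_class_index(records):
    """Invert the records: for each heading, collect the class areas it leads to."""
    index = defaultdict(set)
    for record in records:
        for heading in record["headings"]:
            index[heading].add(record["class_area"])
    return index


class_index = build_class_index(RECORDS)

# One heading gathers works that are shelved in different areas of the stacks.
print(sorted(class_index["Commercial travelers"]))  # ['HF5541', 'XX0000']
```

A record without assigned headings or a class number contributes nothing to such an index, which is the practical meaning of a book being outside both systems.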


When these predictable features are not available--as in Minimal Level Cataloging, which
destroys all of this predictability--the alternatives that are left for reference librarians are to do
key word searches of titles (or note fields, if they even exist [and they do not exist in either
MLC or PREMARC]) with Boolean combinations. These alternatives are, of course, necessary on
many occasions, and they are excellent complements to a vocabulary-controlled catalog and a
[...]
these systems.


What is disturbing to reference librarians is that there is a noticeable drift at LC away from 
a proper appreciation of cataloging structure. It is all the more disturbing because LC, being the 
source of that structure, plays a major role in determining the quality of research that can be done 
throughout the entire nation. 


There seem to be at least two different factors involved in this drift. 


The first factor, obviously, is economics. In times of budgetary problems the whole Library 
faces cuts. This is simply unavoidable. Yet, given the extremity of some of the cutbacks in the 
quality of cataloging (e.g., MLC), one cannot help wondering if everyone is aware--both in
Collections and Constituent Services--of the extent of the damage this does to our very mission in 
making knowledge records accessible. Simply having books available on shelves in an accession- 
number arrangement, or having a virtually unfindable, skeletal record in the computer, does not 


Now, we've all heard that “some record is better than none,” but let's take a deeper look at 
this. When we do subject searches in the reading rooms--by far the most frequent type of inquiry--
we look first for LCSH terms. And when we find them we do not then go on to also search every 
key word variant we can think of--which is what we would have to do to get MLC books. The whole 
purpose of having a vocabulary-controlled system in the first place is so that we won't have to puzzle 
out synonyms and variants. When either reference librarians or readers get good results with LCSH,
the strong human tendency is to stop there and go with whatever is in hand. If extraordinary extra 
efforts have to be taken to find the MLC books that aren't in the subject categories, then those 
books will simply stay unfound.


I don't mean to offend anyone on this, but I think that people who casually assert that “some 
record is better than none"--as though this situation were a solution to the actual problems of
reference--display a questionable assumption about how reference really works. Minimal Level 
Cataloging takes books out of both of the systems of access that reference librarians rely on. MLC 
records aren't grouped together in categories in the catalog, revealing their relationship to other 
works on the same subjects; nor are the terms needed to retrieve them determinable in a systematic
and predictable manner via cross-references and tracings. Nor are the full texts of the books
themselves browsable in the stacks, because they aren't shelved in the subject groups. While the
technique of key word searching of skeletal surrogate catalog records provides an avenue of access
to them, it is not a system of access because it lacks the crucial components of predictability and
serendipity, and beyond that, in preventing access to full texts in a systematic manner, it lacks depth
as well.


With MLC records, we cannot tell in advance which words to type in even within one 
language, let alone in over 450, since there is no vocabulary control; and if the terms we guess at 
are even slightly off then we will miss whole groups of relevant records without knowing that we've 
missed anything. 


The problem is further compounded by the LOCIS software's inability to do word
truncation--i.e., if we miss the English-language title words (which have problems of their own with
plurals and possessives) then we are thrown into the arena of books in all of the other 450 languages
that scholars come here to retrieve. And there, too, we have to specify in advance not only all of
the variant words for a subject, but also all variant forms of each word--but now the problem becomes
geometrically complicated by inflected languages (e.g., anthropos, anthropou, anthropoi, anthropon,
anthrope, anthropois, anthropous), which comprise a large percentage of our collections. (Remember
that the English language books are only about a quarter of our book holdings.) If the records had
[...] tell which terms to search (and in only the one language, since the English headings round
up all of the foreign books as well) and which form of the terms to use. We could find whole
categories all at once, in other words, not just scattered and isolated individual titles.
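The cost of searching without truncation can be shown with a brief Python sketch that uses the seven inflected forms listed above. The placeholder titles and the two function names are invented for the illustration; the sketch makes no claim about how LOCIS itself matched words, beyond the statement above that it could not truncate.

```python
# The seven inflected forms quoted in the text.
FORMS = ["anthropos", "anthropou", "anthropoi", "anthropon",
         "anthrope", "anthropois", "anthropous"]

# Placeholder titles, one per form; the wording is invented for illustration.
TITLES = [f"placeholder title {number} {form}"
          for number, form in enumerate(FORMS, start=1)]


def exact_word_search(titles, word):
    """Return titles containing the word exactly as typed (no truncation)."""
    return [t for t in titles if word in t.lower().split()]


def truncated_search(titles, stem):
    """Return titles containing any word that begins with the given stem."""
    return [t for t in titles
            if any(w.startswith(stem) for w in t.lower().split())]


exact_hits = exact_word_search(TITLES, "anthropos")
stem_hits = truncated_search(TITLES, "anthrop")

print(len(exact_hits))  # 1
print(len(stem_hits))   # 7
```

Even with truncation the searcher must still guess the right stem in each language; a single controlled heading removes that guesswork.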


MLC lacks not only this predictability but also the aspect of serendipity. By this I mean that 
with MLC we cannot simply notice books--or even surrogate catalog records--that are related to the
ones we find by other means; we have to specify in advance everything we want to see, and the
specification must be precise and complete. Anyone who has ever done any library research knows 
how inefficient such a system is--the ability to recognize works that could not be specified in advance 
(either in the catalog or on the shelves) is a crucial part of the process; and it is utterly lost with 
minimal level cataloging. 


MLC also precludes depth of access to the collections because it takes books out of the class
scheme, which is the primary means by which we have systematic access to the tables of contents,
chapter headings, indexes, illustrations, etc. etc. (We have three regular researchers from the
Oxford English Dictionary who routinely use the classified arrangement of full texts to locate specific 
appearances of individual words. Their assignments are sent over from England every week because 
British researchers don't have such good access in their own country's research libraries.) The only 
thing searchable in MLC is a skeletal catalog record; there is no access to full texts in browsable 
groups. 


What I am concerned about, then, is the naiveté about reference that apparently lurks 
behind the assumption that MLC books, with “some” record, are “still” accessible. In the large 
majority of cases they simply are not. (Again, if the books needed to answer the six questions at the 
beginning of this paper had received MLC treatment, they would not have been found.) Most of 
the information in them is just plain lost because the books cannot be recognized in contiguity with
other, similar works. If we do not already have an author's name (and we very seldom do), the
obscure foreign language works aren't findable by prior specification--i.e., if they aren't easily noticed
in association with something else, as part of a category, they aren't found at all. (To focus on just
one of the six examples, I never would have found the date of the Novgorod monument if the 
existence of the Russian booklet with the answer hadn't been brought to my attention by the 
operation of both catalog systems. And yet it is precisely this kind of work that will now wind up 
in MLC. Thirty years from now, when the Librarian of Congress--whoever she is--wants the date 
of a memorial in the independent Republic of Armenia, the reference staff won't be able to find it 
if it's in MLC. The information sitting on the shelves will be lost because we've cut off the systems 
of access to it.) 


The assumption behind MLC seems to be that people will actively search for its records or 
take extra steps to find them. But the reality of reference is that they will do no such thing. As a 
general rule, they will find--and settle for--whatever we make it easy for them to find. (The principle 
of least effort is a reality--and, yes, even with “experienced” scholars.) 


Perhaps an analogy would be useful. I've already mentioned that having both a 
vocabulary-controlled catalog and a classified array of full texts gives us the “corkscrews” we need 
to get the wine out of the bottles at LC. Having only MLC access, in contrast, is like having an 
icepick instead of a corkscrew--it enables us to punch only small holes through the cork, and to 
“spritz” out only a few individual drops at a time. The wine cannot flow out through such small 
holes. And while the laws of physics may tell us that technically it is possible to empty the bottle 
through such holes, the principle of least effort tells us at the same time that virtually no one will 
bother to do it when there are other bottles that can be opened easily. (Note, however, that the 
other bottles will not have the same content--nevertheless, they will be used regardless. It is not the 
quality of content in a source that determines its use--rather, it is ease of access.) And it is 
irresponsible to blame on “readers’ laziness” what are actually faults in the Library's system design. 
A workable system must take into account what we already know about what can reasonably 
be expected of users’ information-seeking behavior. 


Perhaps we need to take another look at priorities, especially for subject cataloging. Perhaps 
the balance between Descriptive and Subject catalogers could be looked at. Perhaps some of the 
descriptive work can be cut back so that more of the subject work can be preserved (e.g., as painful 
as it would be to reference, I'd rather see cuts in time-consuming series authority work than have 
more books go into MLC and thereby lose so much subject access). 


Perhaps most of all we need to reduce the priority of the book backlog problem. Perhaps 
the pressure needs to be removed from the managers’ backs to make so many cuts in the quality of 
the book cataloging. Perhaps attention could be focused instead on the nonbook formats, where 
collection-level treatment offers a hope of generating large statistics relatively quickly. Perhaps, 
following a lead suggested by a P&P colleague, we really need to declare some resources as “attic 
collections” that have already received all the treatment they are ever going to get (barring a 
budgetary miracle, or volunteer work by retirees or other concerned supporters). We are not the 
only large library with a backlog; further, we do not need to consider ourselves to be in a race with 
anyone. 


[...] to show a bit more courage in standing up for the value of the work done by our catalogers. 
Perhaps we all need to keep more clearly in mind why it is that we are doing the cataloging in the 
first place (rather than just transcribing key words). If a new reliance on MLC as an “acceptable” 
solution to the book backlog problem turns into an habitual disposition to think that full cataloging 
was never really important to begin with, then we will truly have lost much more than we've gained. 
For the sake of a quick fix to one problem we will have killed the goose that lays the golden eggs. 


I mentioned that there seem to be at least two factors involved in our drift; one impetus 
behind our corner-cutting is, obviously, economics. The other--which built up a strong head of steam 
under the previous administration--is a fundamentally different conception of what the future of the 
Library ought to be. In a nutshell, it is that the primary goal we ought to work towards is to have 
“everything in the computer” so that any scholar at any “workstation” can call up “the whole Library 
of Congress.” 

I'm going to have to oversimplify a bit here, and pass over some qualifications in order to 
make this point stand out in relief--because the point does have to be made, and my presenting it 
in an extreme form here is, unfortunately, not all that different from the way it has been presented 
by those who believe it. 


The point, then, is this: The “computer workstation” model tends to entail two concealed 
propositions: 1) that the new avenues of access to catalog surrogates provided by computerized key 
word and Boolean search capabilities are making vocabulary control and authority work unnecessary, 
and 2) that the technical capability to mount full texts online or in CD-ROMs makes irrelevant the 
continuation of a classification scheme for printed books. In other words, the retrieval capability of 
the computer is seen to be “so powerful” that any records within it--whether catalog surrogates for 
full texts, or the full texts themselves--no longer need the standardization, categorization, and 
integrative relational linkages provided by the work of professional catalogers. 


(The oversimplification I spoke of lies in the connection between the computer workstation 
model and these two propositions. This model, in reality, does not have to entail either or both [that 
would be ideal]. But the fact is that it usually does entail both; and this version of the model will be 
my concern here because of its growing prominence at LC.) 


That the first proposition has secured much more than just a toe-hold at LC--at many levels, 
and in both Collections and Constituent Services--has already been demonstrated a number of times. 
Its most virulent outbreak occurred a few years ago, under the previous administration, in the 
attempt to replace the Main Card Catalog with the PREMARC (PREM) database. There were 
several issues involved in that debate (and for purposes of brevity here I won't go into the side issue 
of the K.G. Saur microfiche of the catalog), but the most important one was not that of “cards vs. 
higher technology”--as it is still misperceived to have been in some quarters. 


Rather, the most important issue was precisely one of structured access to catalog records 
in a system of vocabulary control and authority work, versus an unstructured avenue of access via 
uncontrolled key words in postcoordinate Boolean combinations. The PREMARC database does 
not follow the principle of “uniform heading”--or even come close to it. It has all of the thousands 
of obsolete headings of all of the first 9 editions of LCSH (in addition to multiple misspellings of 
most of them)--i.e., unlike the card catalog, it does not have only the 9th edition forms; rather, it has 
all outdated headings from all previous editions as well. (For example, PREM includes three 
different headings for the Second World War, each of which was valid at one point in the catalog's 
history: “European War, 1939-[1945],” “World War, 1939- ,” and “World War, 1939-1945”--plus 
hundreds of miskeyed and misspelled variants; it also has three different category terms for 
computers, each of which was also valid at one time: “Computers,” “Electronic calculating-machines,” 
and “Electronic differential analyzers”--again, with multiple misspellings. And the millions of 
miskeyings in PREMARC undermine the vocabulary control to such an extent that they are just as 
big a problem as the presence of so many wrong subject headings.) 
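

The difference between the two avenues of access can be sketched in a few lines of code. This is only an illustration under stated assumptions: the four toy records, the `AUTHORITY` mapping, the miskeyed form "Wrold War," and both function names are hypothetical and do not describe how PREMARC or any LC system is actually built; the three World War II headings are the ones quoted above, with "World War, 1939-1945" assumed to be the current uniform form.

```python
# Hypothetical toy records: (title, subject heading exactly as keyed).
RECORDS = [
    ("Book A", "European War, 1939-[1945]"),
    ("Book B", "World War, 1939-"),
    ("Book C", "World War, 1939-1945"),
    ("Book D", "Wrold War, 1939-1945"),  # an invented miskeyed variant
]

# Hypothetical authority file: obsolete or miskeyed form -> uniform heading.
AUTHORITY = {
    "European War, 1939-[1945]": "World War, 1939-1945",
    "World War, 1939-": "World War, 1939-1945",
    "Wrold War, 1939-1945": "World War, 1939-1945",
}


def uncontrolled_search(heading):
    """Return titles whose keyed heading matches the query string exactly."""
    return [title for title, keyed in RECORDS if keyed == heading]


def controlled_search(heading):
    """Resolve the query and every record to a uniform heading, then match."""
    target = AUTHORITY.get(heading, heading)
    return [title for title, keyed in RECORDS
            if AUTHORITY.get(keyed, keyed) == target]


print(uncontrolled_search("World War, 1939-1945"))
print(controlled_search("World War, 1939-"))
```

In this toy setting the exact-match search under the current heading retrieves only one of the four books, while the search routed through the authority mapping retrieves all four from any of the variant forms. The sketch says nothing about cost: building and maintaining the mapping is precisely the authority work at issue.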


Moreover, PREMARC entirely lacks an integrative cross-reference structure, either for 
the once-valid but now obsolete subject terms or for corporate bodies and other names, which have 
also undergone many significant changes over decades. (The cross-references in the 9th edition of 
LCSH fail to turn up the earlier forms.) 


In PREMARC, then, the intellectual structure of the Main Card Catalog that was so 
laboriously created over eight decades by hundreds of professional catalogers, and at a cost of 
millions of dollars to the taxpayers--the predictable structure that reference librarians are so very 
dependent on--is simply gutted. Neither readers nor reference librarians can find the proper 
headings in systematic ways because the current cross-reference structure does not work with 
PREMARC records, nor are the tracings on the catalog records in it reliable as uniform headings. 


Nevertheless, the unargued assumption of the previous administration was that the 
computer’s component word and Boolean search capabilities "made up" for the gutting of the 
intellectual structure of predictable categories and systematic linkages among the categories. This 
assumption is demonstrably false to anyone who would care to actually use the database; but 
decisions were not being made on the basis of evidence--only on the basis of the unexamined 
presupposition that any computerized format must be better than a manual one, regardless of 
profound differences in content and structure. 


The ultimate reversal of the decision to throw away the card catalog came about as a result 
of 150 professional reference librarians from six different divisions--including CRD--signing petitions 
virtually demanding its retention until such time as PREMARC is cleaned up. (And the estimate 
there is that a full cleanup of the database would take 40 people working full-time for 23.3 years.) 
In retrospect, however, one gets the feeling of having been only an Horatius at the bridge impeding 
the attackers’ advance, rather than being an evangelist successful in converting them, for the same 
assumptions continue to threaten the intellectual structure of the catalog system. 


For example, just to stay with PREMARC for a moment, we are now embarked on a course 
that will make this database much more attractive to use than the card catalog. We are planning to 
put it into ACCESS, the touch-screen menu system that is intended to enable readers to bypass 
consultations with reference librarians. And, in effect, we are thereby significantly tilting the entire 
slope of our retrieval system for the older books in our collection. We are, in effect, making it much 
easier for researchers to miss the right category terms for their subject. We are making it much 
easier for readers to miss linkages to subjects that they hadn’t anticipated. We are making it much 
easier for readers to assume that we do not have relevant works on their subjects, and to give up 
their searching prematurely, in the mistaken belief that the computer has given them "everything." 
We are making it much easier and more attractive to do shoddy historical research. We are utterly 
ignoring the reality of the principle of least effort and the need to integrate it into system design; 
instead of sloping the system to make the high-quality retrieval system the first point of entry to the 
older books, we are encouraging readers to use a poor-quality, unintegrated, and superficial source 
instead (and we are doing it in spite of the abundant literature indicating that most of them simply 
will not "bother" to check the cards as a backup.) 


What is especially noteworthy is that the ACCESS version of PREMARC doesn’t even allow 
the key word and Boolean search capabilities that are supposed to “make up” for the loss of vocabulary 
control. 


And why are we doing all of this? Because PREMARC in ACCESS fits into the "computer 
workstation" model envisioned for the future of the Library; and, obviously, the Main Card Catalog 
does not. Our move in this case away from quality cataloging is certainly not necessitated by 


The assumption that any computerized retrieval is sufficient for researchers shows up 
elsewhere, too. I've already noted its other most virulent form, that MLC is an acceptable "solution" 
to the backlog. 


Perhaps this needs to be said: part of the very mission of the Library is to facilitate not just 
the collection but the integration of knowledge; and it is precisely this integration that is destroyed by 
MLC--and by PREMARC, too, though to a different degree. Both treatments take books out of the 
integrating system that enables them to be perceived and retrieved in association with other, similar 
books. 


The casual assumption that key word/Boolean access is sufficient even for retrieval--let alone 
intellectual integration--leaves the trace of its presence in other areas as well, e.g., in the Library's 
original plan to do copy-cataloging: we were going to accept LC subject headings from other 
libraries, but we were planning not to shelve the books themselves in subject groups according to 
copied class numbers. The reference staff complained forcefully enough, through the Cataloging 
Forum channel, that this potential disaster was averted--but the odor remains: why did we assume 
in the first place that computerized catalog records alone were adequate for retrieval, without the 
second half of the catalog system (the subject-browsable array of full texts on the shelves)? Could 
it be that since the access system made up of printed texts (i.e., arranged in browsable subject groups 
on the bookshelves) cannot fit into the computer workstation model, it is therefore considered to 
be unimportant? 


Apparently, too, there is now some significant questioning of the value of standard (or other) 
subdivisions in pre-coordinated relationships with LCSH headings, and that we should “simplify” their 
assignment. (Are they necessary? In a word: YES!) And there is also now, already, a greater 
[...] and cross-references to extend the vocabulary-control network. Both of these "developments" can 
cause long-term damage because not only individual book records are affected--the intellectual 
structure of the overall catalog system itself is straightjacketed. 


Now I know there is a tendency to regard such practices as acceptable as long as we don’t 
receive “complaints from outside libraries" (who, conveniently for such a rationale, are not in a 
position to notice a change in the pattern of the forest on the basis of seeing only a few of its trees). 
But, really, don’t we need to show some leadership in the library field here? Don’t we need to hold 
ourselves to a higher professional standard than just, in effect, whatever we can get away with 
without being observed? Perhaps we need to remind ourselves of what ought to form the basis of 
our assumptions: that the purposes of cataloging (e.g., to provide predictability, systematic serendipity, 
and depth of access) need to be our criteria of judgment; that the books which continue to come in 
at the rate of over 500 per day are not becoming one bit less original, less ground-breaking in new 
subject areas, or less complicated in their relationships to other books--and that if the cataloging of 
the books is to continue to perform its crucial integrative function, it has to be an accurate reflection 
of the originality and complexity of the literature itself. 


Additionally, there is the economically-induced and apparently unavoidable need to accept 
copy LCSH-cataloging from other libraries--granted, it’s better than key words alone, and also 
granted, we’re forced to accept it; but didn’t we learn from the COMARC project in the 1970s just 
how badly many other libraries--including many of the "good" ones--assign LCSH terms? (Again, 
subject cataloging is not merely a matter of assigning headings; it is a matter of also maintaining the 
network of cross-references and relationships among the headings.) In other words, our rush to 
embrace other libraries’ work should not be untempered by our own past experience of the average 
quality of the copied records; and it is a matter of genuine importance whether the records get 
reviewed by subject catalogers in addition to CM&P staff. And for that matter, it is also a matter 
of genuine concern whether or not the books in the "editions experiment" get reviewed by subject 
catalogers. (The automatic acceptance of old subject headings--even those from PREMARC!--for 
new editions of books is simply not a wise practice.) 


So far I’ve been talking mainly about the noticeable slippage in the last decade in our assumption 
of the basic importance of the LC Subject Headings system, and that this drift seems to be towards 
accepting the notion that uncontrolled key word searches with Boolean combinations are acceptable 
as replacements for, rather than as supplements to, LCSH. That’s bad enough--indeed, if outside 
libraries were to get a clear sense of this drift, and also were to suspect that such changes were 
approaching some kind of critical mass, then I suspect the flap we went through over MARC 
licensing would be tame by comparison. (And it is not an encouraging sign when 81 percent of our 
own professional subject catalogers have recently signed a petition to the Librarian, objecting to the 
“permanent decline in cataloging quality” that they foresee under their new organizational structure. 
One hopes that their concerns can actually be addressed rather than dismissed as “resistance to 
change." The quality of our catalogers’ work directly affects such outside groups as ALA, ARL, and 
OCLC--all of whom are not without clout in dealing with Congress. To prevent their intervention 
is in our own interest; and therefore we need first to refresh our own understanding of the value 
of the work that our catalogers do.) 


But there’s even more in the wind--as already hinted at, the drift also includes an apparent 
questioning of the need for the LC classification scheme, the second component of the systematic 
intellectual structure defined by our cataloging. (The belief that subject classification of print-format 
books is no longer important is the second concealed proposition in the computer workstation 
model.) 


In 1982, in his book on the Library, Charles Goodrum stated the matter baldly: 


... It is easy to see that the traditional answers to the traditional questions fall apart with videodisc 
technology--or at least traditional cataloging techniques do. If every disk has some 3,000 books on 
it, we can no longer shelve “books” by call number and keep inserting the latest volume in among 
its subject peers. Subject classification by class number on open shelves for browsing can be 
forgotten. The present guess is that disks will be loaded in the manner of a bookstore: all fiction 
on one “wall,” political science on another, cookbooks, history by country, psychology, children's 
books, biology, et cetera. Within each broad category, it is assumed that books will be stored by 
an acquisition number ... 


Eight years ago, when our rose-colored glasses were still on, we regarded full-text databases 
almost exclusively as technological concerns--i.e., the only important impediments to progress were 
technological in nature, and once these problems were solved, then everything else would inevitably 
“come around.” Since then, however, we have learned the hard way that the matter is not simply 
a problem for engineers. There are also legal and economic problems that simply will not go away. 
So intractable are these problems, in fact, that the much-publicized digital optical disk terminals were 
entirely removed from the public reading rooms in April of this year; and the system has now been 
confined to CRS’s use. The reasons that were not apparent in 1982 are inescapable now: 


1) The copyright law continues to show stubborn resistance to change. Because of the 
the digitization of texts is not a mere copying of them (as with microfilm); it is much closer 
to republication of them, with all the attendant problems of intellectual property rights, fee 
requirements--and susceptibility to large lawsuits. 


2) The digitizing process for texts is itself very expensive--more so than we anticipated, even 
for raster-scanning. (OCR scanning, which requires extensive human editing, is even more 
costly.) The realities of the federal deficit, of the Savings & Loan debacle, and of 
Gramm-Rudman have all had--and will continue to have for many decades--major restricting 
influences on the availability of federal funds. Over the long haul, such realities cannot help 
but prevent anything but token efforts on LC’s part toward digitizing the full texts in our 
collection. And any money that is spent for this purpose is money that is not spent for 
cataloging the print-format books in our backlog. 


3) The National Research Council, the National Archives, and LC’s own Preservation Office 
(among many others) have all concluded that electronic formats--including optical disks--are 
not acceptable for long-term preservation of knowledge records. Again, the problem is not 
merely a matter of technology--for one thing, the necessary electronic transfer of data from 
one generation of media to another, repeatedly, will be very expensive. And the problem is 
further complicated by the fact that producers of information in such formats are already 
demanding payment of copyright royalty fees when libraries seek to make such generational 
transfers (cf. AL, 3/90, pp. 194-95). Another problem is that market forces rather than 
standards will ultimately determine a medium’s longevity; and market forces can cause 
standards themselves to become obsolete. (Anybody remember 8-track tapes, which were 
manufactured compatibly by several different companies?) 


While it gives us some favorable publicity to be able to point to CRS’s use of the digital disk 
system, we nevertheless, at some point, have to consider substantive issues as well as appearances, 
and face the reality that the CRS example cannot be generalized to the rest of LC because 1) 
Congress has entirely exempted this particular use of the disk system from the Copyright Law--which 
it is not going to do for any other application; and 2) CRS is interested in fast access to current 
information--an appropriate use for the disks--but simply does not consider the long-term archival 
preservation of material to be high on its priority list. That’s something for the rest of the Library 
to worry about. 


In short, then, we have to take any prediction of the Library of Congress being composed 
exclusively or primarily of digitized electronic full texts as a very serious mistake. Certainly such 
texts will be--and should be--one component of the Library. But we’ve already learned the hard way 
that they cannot be the entire thing. 


Or have we learned that lesson? 


It is most gratifying to see that the Librarian himself has learned it, as his recent 
pronouncements on the American Memory project explicitly emphasize its purpose to draw readers 
into libraries, as places, to begin with, and to start them on research projects that will lead them to 
books. This is a far cry from the extremist position of the previous administration, which became 
so infatuated with the ideal of the "independent computer workstation,” at which any scholar at any 
location could call up "everything" or “the whole Library of Congress." Such a digitized full-text 
library of “everything” obviously would not need a classification scheme--for the reasons given by 
Goodrum. Further, as we saw in that administration’s attempt to substitute PREMARC for the card 
catalog, it was also considered fully acceptable to view a key word/Boolean search avenue as a 
replacement for the vocabulary-controlled system (with predictable subject headings, a network of 
cross-references, and reliable tracings), rather than merely as a supplement to it (which has always 
been the view of the reference staff). 


Unfortunately this “world view" of the future of the Library (as being that of a distributed 
series of workstations) gained so much currency under the previous administration, and so many 
careers were formed in promoting it, that it is still very much with us: 


• Our “video tour" of LC that we show to visitors in the Madison Building continues to extol 
the virtues of optical disks as the library of the future (with no mention of the embarrassing 
fact that the system has already been removed from the public reading rooms)--and repeats 
the proclamation to all comers every half hour all day long. 


• Our own Planning Office continues to assert that "Libraries of the future will be a service, 
not a place. What you once had to go to the Library of Congress to see can be viewed at 
home" (Wall Street Journal, 2/23/90). (Questions: Will the full text of the 19th-Century 
Russian booklet with the date of the Novgorod monument be included in the system? Will 
all of Sample Case be in there too? And if it’s going to take us decades--more likely 
generations--of very large expenditures just to get full catalog records into PREMARC, where 
is the money going to come from to go beyond that for full texts?) 


• Two ITS speakers, at a staff briefing in the Mumford Room on 4/18/90, glowingly 
described the rewiring of the Main Reading Room to prepare it for just such workstations: 
“in the Reading Room you do not read books; you read terminals." (Questions: When 
readers are already vociferously complaining about the noise made at the desk next to them 
by "silent" PCs, will their complaints stop when there is a keyboard at every desk? If we use 
the little headband screens to display information, do the ITS speakers wish themselves to 
wear the same headband just used by some of the more unwashed denizens of the reading 
room? And when the readers inevitably have numerous questions about the operation of 
the system--questions that no menu in ACCESS can answer--will it be conducive to a reading 
room to have reference librarians continually talking to the patrons at their desks? Has no 
one noticed that, unlike the arrangement of ITS itself, MRR is not partitioned into 
acoustically shielded workstations? and that to set up such barriers would destroy the 
expensive “restoration” of the room? Would the ITS speakers want to have their own 
partitions removed in exchange for “white noise"? Will “white noise" mask the sound of a 
keyboard right next to a reader in MRR? In short, isn’t it just possible that there are other 
considerations besides the technology of the room’s wiring that are more important to the 
function of the room? Or is the use of new technology simply an end in itself that should 
set the agenda for everything else? Shouldn't we be using the technology, rather than vice- 
versa?) 


• A formal presentation to the Madison Council by Constituent Services personnel on 
4/24/90 confirmed this plan for MRR, including such statements as "In the Main Reading 
Room of tomorrow, equipped with small terminals at the readers’ desks, we envision an 
online information service, entered through ACCESS. This service will include a 
comprehensive online reference library of full text materials ... The goal is to create a single 
source for information, available from a single place--the computer terminal sited at each 
patron’s desk in the Main Reading Room." (Questions: Won't writers of reference books-- 
and I’m one of them--have intellectual property rights in the future? And since many of the 
full texts already in our Machine-Readable Collections Reading Room require extensive 
support documentation in paper format in order to be loaded and searched efficiently in the 
first place, why won’t the MRR of the future require as much staff assistance to patrons at 
the workstations as the current MRCRR does right now? And even if we concentrate on 
loading digitized texts that are copyright-free [assuming, somehow, that people in MRR need 
lightning-fast access to material that is more than 75 years old], will the support 
documentation itself also be copyright free? And if we can’t load it online through 
ACCESS, then we do have to have noisy librarians stationed by the desks--don’t we?) 


Frankly, it does positive damage to the maintenance of the quality of access to most of the 
Library’s knowledge records when we try to force the entire “universe” of a large research library 
into the Procrustean bed of the independent computer workstation model. It just won’t work, and 
we ought to have learned as much from our own experience with the digital optical disk experiment. 
Research libraries--as opposed to small, specialized libraries--will have to continue being places--with 
books!--as well as providing electronic services until such time as those 500 new books in print 
formats stop coming in every day and also until such time as Congress extends CRS’s peculiar 
exemption from the Copyright law to the rest of the nation. LC’s failure to recognize these realities 
under the previous administration has already put us in a deep hole regarding remote storage--we 
should have been planning and lobbying for it a decade ago, but we mistakenly put all our eggs in 
the optical disk jukebox basket instead. And that system has now gone belly-up for public use 
precisely because of our short-sighted refusal to look at legal, economic, and preservation realities 
lurking behind the one thing we did look at: the technology. (It is a most interesting world view that 
sees its most important problems as amenable simply to technological solutions.) 


I am not saying that we should do away with automation! An assessment of both the 
strengths and weaknesses of it, however, tends--at least around LC--unfortunately to be perceived 
as indicative of a “neo-Luddite” or “sentimental” attitude; and when a view can be thus "branded" 
it can also be readily ignored and dismissed. Such dismissal, however, will work only to the long- 
term detriment of the Library, because the non-engineering facts that we ignore in our planning 
simply will not go away; and researchers will continue to need access to knowledge records that 
cannot be found without a vocabulary-control system and a classification scheme. 


What I am saying is this: Research libraries cannot be reduced to the computer workstation 
model. We would be much better off in our planning if we used a different, and larger, model. One 
component of this larger model will be whatever can be made available at workstations--made 
available legally and economically, and with due weight being given to other systems of access that 
are equally important. (By the way, in addition to the legal, economic, and preservation problems, 
there is also a psychological impediment to putting "everything" in a workstation: it is known that 
most people can absorb only about seven, plus or minus two, options. Any menu that provides more 
alternatives than that cannot be grasped, and will result in researchers becoming “lost in 
hyperspace.") In addition to this one “workstation” component, however, we need others, too, that 
are every bit as important and that need at least equal managerial time, attention, and funding. 


One of those other components in the larger model has to be a system of access that 
provides entry into full texts--including tables of contents, chapter headings, indexes, illustrations, and 
even individual paragraphs and words--in a way that circumvents the need to pay royalty fees. 


The classification scheme is that way. It was not designed to arrange catalog cards or 
surrogate records; it was designed to arrange books so that their contents could be systematically 
examined in detail quickly in predictably limited groupings. (And even with our closed-stacks general 
rule, we frequently give stack passes to the many researchers whose particular inquiries can be 
answered only by direct browsing of subject-grouped full texts. I should note that the function of 
such browsing in a closed-stacks library is a matter that could itself be the subject of another long 
paper; and so I will not pursue it here, other than to say that it is necessary for reference work in 
many--though not all--cases.) 


But for the classification of full print-format texts to work--in order for us to find which 
classes we need to browse through--we need a good index to the class scheme.


And the vocabulary-controlled, precoordinated, authority-networked, and scope-note defined
catalog of the collection is that index (in addition to serving as a valuable access system in itself, even
apart from its functional link to the class system). Aspects of a subject that are scattered in the class 
scheme are grouped together, and yet displayed in an array of distinctions (via subdivisions and 
precoordination of terms), in the catalog in ways that are not and cannot be replicated by 
uncontrolled and post-coordinate key word searching. I've already given examples of this; let me 
add one more: 


Works on the Tuaregs, a nomadic people of northern Africa, cannot be systematically or 
predictably retrieved through key words, first because there are too many variant forms of 
the word itself in too many languages (e.g., Tuareg, Tuaregs, Touareg, Touraregs, Touaregue, 
Touaregues, Twareg, Touareges, Tawareks), and second because half of the books on them 
don’t even have any of these words in their titles (e.g., The Last Caravan, Sahara Toujours 
Recommence, La “Croce” di Agades, Handeln in einer Hungerkrise, Attarikh en- Kel-
Danneg dat Assa n-Ferensis). Also, the books themselves are scattered in several different
class groupings on the shelves: over a dozen are in DT346 (Regions, tribes of the Sahara);
eight are in DT547 (Local history of French West Africa); three are in GR353 (Folklore of
North Africa); a major 240-page bibliography is in Z3515.L48 (Bibliography: North Africa);
and so on with a dozen different classes. The vocabulary-controlled heading in the catalog,
however, groups all of these works together, and thereby points out all of the different class
ranges that hold works on the tribe. Moreover, its score of precoordinated subdivisions 
enables readers to recognize within one language (English) the extent of options and 
distinctions available within the subject: 


Tuaregs --Africa, West 




This range of subject options cannot be noticed in the classified bookstacks, which scatter 
the various aspects without providing cross-references among them. Moreover, even in the
catalog itself readers could not recognize such an array of distinct aspects simply from
examining non-expressive or confusing foreign-language titles collocated under the broad
heading without such English-language subdivisions. Further, the subdivisions on the catalog 
records direct readers efficiently to the various scattered areas of the classification scheme 
that contain books on whichever particular aspect of the large subject they wish to examine. 
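
The contrast drawn in this example can be put in the form of a small sketch. It is only an illustration under stated assumptions: the variant spellings and three of the titles are taken from the example above, but the two "Hypothetical" titles, the class number attached to each record, and the function names are invented for the purpose and are not actual LC records.

```python
# Variant spellings of the name, as listed in the example above.
VARIANTS = ["Tuareg", "Tuaregs", "Touareg", "Touraregs", "Touaregue",
            "Touaregues", "Twareg", "Touareges", "Tawareks"]

# Toy catalog records. ASSUMPTION: the class number given to each title is
# illustrative only, and the two "Hypothetical" titles are invented.
RECORDS = [
    {"title": "The Last Caravan", "class": "DT346", "subjects": ["Tuaregs"]},
    {"title": "Sahara Toujours Recommence", "class": "DT346", "subjects": ["Tuaregs"]},
    {"title": 'La "Croce" di Agades', "class": "DT547", "subjects": ["Tuaregs"]},
    {"title": "A Hypothetical Folklore Study of the Touaregs", "class": "GR353",
     "subjects": ["Tuaregs"]},
    {"title": "A Hypothetical Bibliography on the Twareg", "class": "Z3515.L48",
     "subjects": ["Tuaregs"]},
]


def keyword_search(records, words):
    """Uncontrolled search: keep records whose title contains any given word."""
    lowered = [w.lower() for w in words]
    return [r for r in records if any(w in r["title"].lower() for w in lowered)]


def heading_search(records, heading):
    """Controlled search: keep records that carry the assigned subject heading."""
    return [r for r in records if heading in r["subjects"]]


def classes_for(records):
    """List the distinct class numbers in which the retrieved works are shelved."""
    return sorted({r["class"] for r in records})


if __name__ == "__main__":
    print(len(keyword_search(RECORDS, ["Tuareg"])))   # one spelling only
    print(len(keyword_search(RECORDS, VARIANTS)))     # every listed spelling
    print(len(heading_search(RECORDS, "Tuaregs")))    # the controlled heading
    print(classes_for(heading_search(RECORDS, "Tuaregs")))
```

Even a searcher who knew every variant spelling would miss the titles that contain none of them, whereas the single controlled heading retrieves every record and, through the class numbers on those records, points to each shelf area that holds works on the subject.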


Let’s all ask ourselves, how would LC’s (or anybody else's) reference staff have come up with 
answers to the six questions given at the start of this paper if the relevant sources didn’t have
findable subject headings, or if private publishers didn’t have LCSH to serve as the basis for their
own indexes, or if full texts were not systematically searchable by subject in predictably limited areas 
of the stacks, or if the subject text-groupings were not searchable in a way that circumvents the need 
for paying stiff copyright royalty and licensing fees? (Note that, by contrast, most databases with 
current full text sources are extraordinarily expensive to search. Moreover, there is little inducement 
for publishers to digitize texts in subject areas that lack a market audience willing to pay such high
fees, no matter how inexpensive the carrier-technology may be.)


[...]
the categories--linkages that can themselves be found in a systematic manner. Moreover, the LC 
system is used by scores of commercial indexes to journal articles and other knowledge records (e.g.,
a dozen Wilson and Information Access indexes in all subject areas, P.A.I.S.,
Speech Index, American Book Publishing Record, the GPO Monthly Catalog, etc., etc.) that
effectively extend the same intellectual structure far beyond the purview of our own cataloging
efforts (e.g., to "Animal sounds" in Social Sciences and Humanities Index, to "Teachers, interchange
of" in Education Index, and so on).


To the extent that reference librarians have access to this structure, we can function very 
efficiently in any subject area. And to the extent that this structure is neglected or minimized (as 
in speed cataloging or the “editions experiment"), or even directly undercut (as in MLC), reference
librarians throughout the country are left in a “universe” of knowledge that is unpredictable,
unsystematic, unintegrated, partial, superficial, and hit-and-miss. 


Other libraries that do subject cataloging (even the largest research libraries) are dependent
on us for most of their records--and for the very cataloging terms that they use when they cannot 
get an existing record from us. Basically, they only supplement locally what we do here. If they had 
to undertake the fundamental work of creating the LCSH network of relationships among subject 
categories, they’d be out of business--they simply do not have the money (and they'll never have it) 
to reconstitute what would be lost if LC’s subject cataloging were not easily available. They primarily 
add furniture within the framework; we are responsible for the structural integrity and coherence 
of the framework itself. 


Given our limited resources, we need to take a hard look at our priorities: If tomorrow we 
lost the MUMS software in LOCIS and couldn't do key word searches, the problem would be only 
local and temporary. The library community as a whole is not dependent on ITS for the creation 
of software that allows component word searching of MARC records; and even if we lost such 
software at LC, we could easily rely on outside expertise, if necessary, to restore it. No one is 
dependent on LC for providing key word--or Boolean combination--software search capabilities. 


But if tomorrow reference librarians no longer had access to LC subject headings--or if we 
no longer had cross-references and scope notes leading to the right search terms--or if we no longer 
had tracings to show us which categories certain individual titles belong to--then where would we 
turn? No one else in the entire library and information field is the source of this component of the 
access system in the crucial way that LC is; and it is this component that makes the literature of the 
world, in all subject areas, findable in a predictable and systematic manner. 


If tomorrow Congress said to us, "You may no longer have funds for putting full texts in 
electronic form, in either the SDI jukebox system or in American Memory," we would still have
many other alternatives available for online access to full texts (e.g., UMI, Dialog, Mead Data,
VuText, Dow-Jones, Electronic Text Corp., Disclosure, etc., etc.)--alternatives that already provide
much more full text material in electronic form than we do or even plan to do. Moreover, most of 
the other electronic publishers provide their material in a way which allows component word 
searching that our own raster-scanning does not allow. In other words, no one is dependent on LC 
for access to full texts in electronic formats. 


But if tomorrow there were no longer an LC classification system that enables readers to search 
the vast majority of full texts that are not in electronic form (with 500 new ones added every day, 
and without any sign of print publishing abatement) arranged in a systematic, predictable, and findable 
manner, how would anyone have efficient access to the detailed information in them that can never 
be captured--or even indicated (e.g., women in Sample Case, Barataria)--in brief cataloging records? 
How would anyone have systematic access to all of the full texts that can never be mounted in 
computers because of economic, legal, and preservation (and psychological) considerations that are 
just as much realities within the total information access universe as are technical and computer- 
engineering considerations? 




While we need to collect and make available as many electronic full text sources as we can, 
though not to the point of jeopardizing our preservation efforts, we simply cannot afford to create 
a “virtual library” in this area. Where we need to focus our creative efforts is in the area that no one 
else is covering, i.e., in maintaining and extending the intellectual structure of categories of
knowledge and of predictable systems of access to the categories. 


This structure--that we create here--is one of the chief glories of the information age and 
of world scholarship in general. Senior scholars who have used other countries’ libraries are 
routinely amazed at the range, flexibility, and depth of subject access that is possible here that is 
simply not duplicated--or even approached--in any other national library, and that spills over to 
thousands of other American libraries because of their use of our system. If LC goes ahead with 
its proposed conference to sort out “who does what" in the information age, I hope that someone 
will make it explicitly, unambiguously, and emphatically clear that this is LC’s unique and unmatched
contribution to the national enterprise: we create, maintain, extend, and stock the standardized
intellectual gridwork that makes the literature of the world--in all subject areas and in all
languages--identifiable and retrievable in a systematic and predictable manner even by people who 
are not already experts in the subjects they wish to research. This is an incredibly valuable--even
[...]
and in times of budgetary shortfalls it must be considered a higher priority
than the creation of full text databases. Our cataloging structure constitutes our major distinctive
contribution to the researchers of the nation. It is the fundamental criterion by which the present 
administration's stewardship of the Library will be judged, and any model of the Library's future that 
minimizes or ignores cataloging quality is doing serious damage to our very mission. 




FILMED 
10/12 /92 


